Thomson Legal and Regulatory at NTCIR-5: Japanese and Korean Experiments

Authors

  • Isabelle Moulinier
  • Ken Williams
Abstract

Thomson Legal and Regulatory participated in the CLIR task of the NTCIR-5 workshop. We submitted formal runs for monolingual retrieval in Japanese and Korean, as well as for bilingual English-to-Japanese retrieval. We employed enhanced tokenization for our Japanese and Korean runs and applied a novel selective pseudo-relevance feedback scheme for Japanese. Our bilingual search participation was a straightforward application of an off-the-shelf Machine Translation system to transform an English query into a Japanese query. Unfortunately, we cannot draw many conclusions from our participation, as our experiments were hampered by technical difficulties, particularly with our tokenization and stemming components.

Similar resources

Thomson Legal and Regulatory at NTCIR-4: Monolingual and Pivot-Language Retrieval Experiments

Thomson Legal and Regulatory participated in the CLIR task of the NTCIR-4 workshop. We submitted formal runs for monolingual retrieval in Japanese, Chinese and Korean. Our bilingual runs from Chinese and Korean to Japanese rely on English as a pivot language. During our monolingual experiments, we compared building stopword lists using query logs to building stopword lists from collection stati...

Thomson Legal and Regulatory at NTCIR-3: Japanese, Chinese and English Retrieval Experiments

Thomson Legal and Regulatory participated in the CLIR task of the NTCIR-3 workshop. We submitted formal runs for monolingual retrieval in Japanese and Chinese, and for bilingual retrieval from English to Japanese. Our main focus was in Japanese retrieval. We compared word-based and character-based indexing, as well as query formulation using characters and character bigrams. Our results show th...

POSTECH at NTCIR-5

This paper describes methodologies for NTCIR-5 CLIR involving Korean and Japanese, and reports the official result as well as retrieval results using NTCIR-3 and NTCIR-4 data. We participated in four tasks: K-K and J-J monolingual tracks and K-J and J-K cross-lingual tracks. Unlike English, in Asian languages such as Korean and Japanese term extraction is nontrivial because of segmentation ambi...

CJK Experiments with Hummingbird SearchServer™ at NTCIR-5

Hummingbird submitted ranked result sets for the Chinese, Japanese and Korean Single Language Information Retrieval subtasks of the Cross-Lingual Information Retrieval Task of the 5th NII-NACSIS Test Collection for IR Systems Workshop (NTCIR-5). For short Chinese (title) queries, a decompounded word-based approach produced higher (statistically significant) mean average precision and first relev...

NTCIR-5 CLIR Experiments at Oki

We participated in the SLIR, BLIR (PLIR) and MLIR subtasks of the NTCIR-5 CLIR task. Our IR system uses language models for document scoring and query expansion, and can handle four languages: Chinese, Japanese, Korean and English. The system utilizes multiple language resources (bilingual dictionaries, parallel corpora and machine translation systems). We attempted to use some techniques includ...

Publication date: 2005